Competition in Social Networks: Emergence of a Scale-free 
Leadership Structure and Collective Efficiency 
Using the minority game as a model for competition dynamics, we investigate the effects of 
inter-agent communications on the global evolution of the dynamics of a society characterized by 
competition for limited resources. The agents communicate across a social network with small-world 
character that forms the static substrate of a second network, the influence network, which is dy- 
namically coupled to the evolution of the game. The influence network is a directed network, defined 
by the inter-agent communication links on the substrate along which communicated information is 
acted upon. We show that the influence network spontaneously develops hubs with a broad 
distribution of in-degrees, defining a robust leadership structure that is scale-free. Furthermore, in 
realistic parameter ranges, facilitated by information exchange on the network, agents can generate 
a high degree of cooperation making the collective almost maximally efficient. 

In a competitive environment with seriously limited 
resources, an individual will be able to make the most 
gains if he avoids the crowds and finds strategies that 
place him in the distinguished class of the elites, or 
of the "few". Even though this class forms a minority 
group when compared to the whole agent society, it can 
largely influence the dynamics of the entire society for 
the simple reason that the elites hold the best strategies 
in the given situation, and thus they become key target 
nodes for others to communicate with, and follow. For 
our purposes, an agent is a leader if at least one agent is 
following, and thus acting on his advice. The influence of 
a leader is measured by the number of followers he has. 
Agents who are not leaders are simply called "followers". 
However, leaders can follow other leaders, thereby 
creating a leadership structure. Certainly, the leadership 
structure, and even which particular agents are leaders 
at all, is often very dynamic (mostly because the success 
of a certain strategy is determined by the context of the 
strategies used by the other agents). 

One of the most ubiquitous mechanisms guiding people 
in deciding whom, or what, to follow is reinforcement 
learning, which is a mechanism for statistical inference 
created through repeated interactions with the environment. 
For example, in iterated situations/games, it 
can be argued that we all monitor our social circle, and 
"score" our acquaintances, including ourselves, based on 
past performance (success measure). We then take more 
seriously, and often follow, those with a higher score 
(success rate). 

In order to study the scenario described above, in this 
Letter we use a well-known multi-agent model of competition, 
the Minority Game (MG), which we modify 
to include inter-agent communications/influences across 
a social network. The two main questions we address 
here are: 1) What type of leadership structure is gener- 
ated? and 2) Can the effects of inter-agent communica- 
tions aggregate up to the level of the collective and affect 
its behavior? 

The original MG is an abstraction of a market played 
by agents with bounded rationality, inspired by the El 
Farol bar problem introduced by W. Brian Arthur. In 
this iterated game, at every step, N agents must choose 
between two different options, symbolized by A and B, 
e.g. "buy" and "sell". Only agents in the minority group 
get a reward. The agents have access to global information, 
which is the identity of the minority group for the 
past m rounds. Each agent bases his choice on a set of 
S strategies available to him. A strategy, which is an 
agent's 'way of thinking', is a prediction of outcome 
A or B in response to each possible history of length m. 
The strategies are distributed randomly among agents, 
and thus in general each agent has a different set of S 
strategies. Agents make their next choice in the game using 
reinforcement learning: every agent keeps a score for 
each of his S strategies, which he increments by one 
each round if that strategy correctly predicted the minority 
outcome (regardless of usage). The strategy used 
to make the new choice is the one with the best score up 
to that time. If two or more strategies share the best 
score, then one of those strategies is picked randomly. 
Previously, the effects of local information in the MG 
were studied both with reinforcement-learning-type [7] 
and non-reinforcement-learning-type agent 
communication mechanisms on Kauffman networks, 
and with non-reinforcement-learning-type mechanisms 
on linear chains. 
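
The rules of the standard MG described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the results reported here: the function name `play_minority_game`, the encoding of A and B as 0 and 1, the integer encoding of the history, and all default parameter values are hypothetical choices made for the example. An odd number of agents is assumed so that a minority always exists.

```python
import random


def play_minority_game(n_agents=101, memory=3, n_strategies=2,
                       n_rounds=200, seed=0):
    """Play the standard Minority Game and return the attendance series A(t)."""
    rng = random.Random(seed)
    n_histories = 2 ** memory
    # A strategy is a lookup table: one prediction (0 = A, 1 = B) per history.
    strategies = [[[rng.randint(0, 1) for _ in range(n_histories)]
                   for _ in range(n_strategies)]
                  for _ in range(n_agents)]
    scores = [[0] * n_strategies for _ in range(n_agents)]
    # Arbitrary initial history: the last `memory` outcomes packed into an integer.
    history = rng.randrange(n_histories)
    attendance = []
    for _ in range(n_rounds):
        choices = []
        for a in range(n_agents):
            best = max(scores[a])
            # Ties between best-scoring strategies are broken at random.
            s = rng.choice([i for i, sc in enumerate(scores[a]) if sc == best])
            choices.append(strategies[a][s][history])
        n_b = sum(choices)                       # agents choosing B
        minority = 1 if n_b < n_agents - n_b else 0
        # Reward every strategy that predicted the minority side, used or not.
        for a in range(n_agents):
            for s in range(n_strategies):
                if strategies[a][s][history] == minority:
                    scores[a][s] += 1
        attendance.append(n_agents - n_b)        # agents choosing A
        history = ((history << 1) | minority) % n_histories
    return attendance
```

Calling `play_minority_game(n_agents=101, memory=6)` returns a list with one attendance value per round, which can then be used to estimate the fluctuations of A(t).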

In our model, a social network of agents is described by 
a graph with vertices representing the agents, and edges 
representing acquaintanceship between pairs of agents. 
This network of acquaintances forms the substrate network 
(G), or skeleton, for inter-agent communications. 
An edge ab in G means that agents 
a and b may exchange game-relevant information. However, 
it does not indicate whether the exchanges influence 
the action of any of the involved agents. That information 
is modeled by a second network, the influence network 
(F), which is a directed subset of G, and in which 
an edge ab, pointing from a to b, means that agent a 
acts on the advice of agent b when deciding the minority 
choice. In the competitive environment of the stock 
market, Kullmann, Kertész and Kaski, by studying time-dependent 
cross-correlations, have recently shown the existence 
of such a directed network of influence among 
companies, based on data taken from the New York 
Stock Exchange. We do not, in general, know the 
precise topology of the social networks. However, it is 
known that social networks have a small-world character. 
Here we take G to be an Erdős-Rényi (ER) 
random graph with link probability p. An ER random 
graph shows the small-world effect, since the diameter of 
the graph increases only logarithmically with the number 
of vertices. The nodes also have a well-defined average 
degree, pN, which results from cognitive limitation 
[13]. Studies using other types of network topologies, 
which are more suited to describe social networks (one 
drawback of ER is its low clustering coefficient), will 
be presented in future publications. Just as in the original 
MG, in our model, in order to make his next decision, 
each agent uses his best performing strategy to predict 
what the next minority choice will be. However, he does 
not necessarily act on that prediction. Instead, the pre- 
diction simply constitutes the agent's opinion, which he 
then shares with all his first neighbors on the substrate 
network G. This is done by all agents simultaneously, 
and thus every agent obtains as information the predic- 
tions of all their first neighbors. Each agent then uses this 
information to make their final choice, via reinforcement 
learning: they keep scores of the prediction performance 
of all their first neighbors and themselves, and update 
the scores after every round by incrementing the scores 
of the agents whose prediction was correct. Each agent 
then acts on the prediction/opinion of the neighboring 
agent with the highest score. Of course, if they have a 
higher score than any of their neighbors, then they act 
on their own prediction. 
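
One round of the networked game can be sketched as follows. This is a hedged illustration only: the helper names `er_graph`, `choose_advisors` and `play_round` are hypothetical, options A and B are encoded as 0 and 1, and ties between equally scored candidates are broken at random, which is an assumption not specified in the text. Because every agent observes the same predictions and applies the same update, the score that all neighbors assign to agent i is identical, so the sketch keeps a single prediction score per agent.

```python
import random


def er_graph(n, p, rng):
    """Erdos-Renyi substrate G: each pair is linked independently with probability p."""
    neighbors = [[] for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            if rng.random() < p:
                neighbors[a].append(b)
                neighbors[b].append(a)
    return neighbors


def choose_advisors(neighbors, pred_scores, rng):
    """For each agent, pick the best-scoring predictor among itself and its neighbors."""
    advisors = []
    for j, nbrs in enumerate(neighbors):
        candidates = nbrs + [j]
        best = max(pred_scores[c] for c in candidates)
        # Assumption: ties are broken uniformly at random.
        advisors.append(rng.choice([c for c in candidates
                                    if pred_scores[c] == best]))
    return advisors


def play_round(neighbors, pred_scores, predictions, rng):
    """Act on the advisor's prediction, then update the prediction scores in place."""
    n = len(neighbors)
    advisors = choose_advisors(neighbors, pred_scores, rng)
    actions = [predictions[advisors[j]] for j in range(n)]
    n_b = sum(actions)                      # agents acting on option B
    minority = 1 if n_b < n - n_b else 0    # an odd n avoids ties
    for i in range(n):
        if predictions[i] == minority:
            pred_scores[i] += 1
    return advisors, actions, minority
```

The list `advisors` encodes the influence network F for that round: an entry `advisors[j] == i` with `i != j` corresponds to a directed edge from j to i.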

The game is initialized by fixing at random S strategies 
for each agent, an arbitrary initial history string, 
and a fixed instance of the substrate network G. After 
many iterations, the game evolution becomes insensitive 
to the particular initial history string. However, it may 
remain sensitive to the quenched disorder in the strategy 
space of the NS strategies that are used, and to the 
quenched disorder associated with the particular social 
network chosen. Thus, there are four relevant parameters 
in this game: N, S, m, and $p \in [0, 1]$. Of course, in reality 
the substrate network can also change (we make new 
friends and others fade away). However, we assume its 
dynamics to be much slower than that of F, and therefore 
it is neglected here. As defined previously, an agent i is 
a leader if it has at least one follower, j, and thus agent j 
follows through action what agent i suggests. For this to 
happen, i has to have the largest prediction score among 
the acquaintances of j, which are defined by the $k_j$ edges 
j has in G. In an ER graph, the degree $k_j$ has 
a Poisson distribution with an average value of $\lambda = pN$ 
and an exponential tail. An agent j will follow only one 
agent's opinion to decide his action, and thus his number 
of out-links is always one, $k_j^{(out)} = 1$. However, the 
number of in-links for agent j, $k_j^{(in)}$, can be any number 
between 0 and $k_j$, according to the number of agents acting 
on his advice. Fig. 1 shows the in-degree distribution 
for various numbers of agents N, network connectivity 
p, and memory length m. The first striking observation 
from Fig. 1 is that over a wide range of parameters the 
in-link distribution is described by a power law with a 
sharp cut-off. Thus, the average number of leaders with 
k followers, $N_k$, follows a scale-free distribution. This happens 
in spite of the fact that the substrate network, which 
is an ER graph, is not a scale-free network; the scale-free 
character was therefore not introduced a priori into the underlying 
structure. The scale-free character of the influence network 
F is selected for by the reinforcement learning nature of 
the agent-agent interaction rules. The fact that a broad 
scale-free structure is selected on the back of a Poisson-distributed 
network seriously limits the size of the leadership. 
Indeed, Fig. 1d), which shows the non-leaders, or 
followers, expresses this fact: the pure followers constitute 
over 90% of the population for the cases presented 
in Fig. 1. 
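
To make the counting of leaders and followers concrete, the short sketch below computes, for a single round, how many agents have exactly k followers. It assumes (hypothetically) that the influence network is stored as a list in which entry j holds the index of the agent whose advice j acts on, and that an agent acting on his own prediction contributes no edge; the function name `in_degree_histogram` is likewise an illustrative choice.

```python
from collections import Counter


def in_degree_histogram(advisors):
    """Return {k: number of agents with exactly k followers} for one round."""
    in_degree = [0] * len(advisors)
    for follower, leader in enumerate(advisors):
        if leader != follower:          # acting on one's own prediction adds no edge
            in_degree[leader] += 1
    return dict(Counter(in_degree))


# Six agents: 0 and 2 follow agent 1, 3 and 4 follow agent 2,
# agents 1 and 5 act on their own predictions.
print(in_degree_histogram([1, 1, 1, 2, 2, 5]))   # {0: 4, 2: 2}
```

In this toy example four agents have no followers (k = 0) and two agents are leaders with two followers each. Averaging such histograms over many rounds and runs gives an estimate of the distribution of the number of leaders with k followers.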

FIG. 1: Leaders and followers. a) and b) show the average 
number of leaders with k followers normalized by the 
average number of leaders with exactly one follower, $N_1$. The 
symbols correspond to varying system sizes and link probabilities, 
p = 0.1 and p = 0.2, respectively, while the dashed 
and thin continuous lines correspond to the same quantity for 
the Random Choice Game on the ER substrate. Next to the 
curves, the thick continuous line has a slope of -1. b) shows 
the same quantity for small memories, m = 2 and m = 4, 
with S = 2 for p = 0.1 and p = 0.2. The curves oscillate 
around the same 1/k law. For all curves in a) and b) the 
averages were taken over 17 runs, which was sufficient due to 
the strong self-averaging property of the quantities. c) shows 
that $a(p) = N_1$ is, to a good approximation, independent of 
the system size N. d) represents the number of followers as 
a function of the system size N. Both for a) and b), m = 6 
and S = 2. 

Plotting $N_k/N_1$, all the curves can be collapsed in 
the scaling regime up to their cut-offs, indicating that 
$N_k(N,m;p) \propto k^{-\beta} N_1(N,m;p)$. The exponent of the decay, 
$\beta$, is very close to unity, which means that $k N_k$ is 
independent of k and the other parameters in the scaling 
regime. Since k is the influence of a leader with k followers, 
$k N_k$ represents the total influence of the k-th layer 
in the leadership hierarchy. The above observation therefore 
means that all layers of the hierarchy are equally influential; 
influence is evenly distributed among all levels 
of the leadership hierarchy. This result is robust and insensitive 
to the particular parameters, even in the low m 
(memory) regime. Here, however, oscillations build up 
around the 1/k behavior, which still serves as a backbone 
for the leadership structure but becomes less obvious 
as m is decreased; see Fig. 1b). Another important observation 
is that $N_1(N,m;p)$ depends strongly only on p 
and not on N or m, thus $N_1(N,m;p) = a(p)$, as shown 
in Fig. 1c). Therefore, we have 

$$N_k(N,m;p) = k^{-\beta}\, a(p)\, f_k(N,m;p). \tag{1}$$

The fact that $N_1(N,m;p)$ is virtually independent of N 
means that if the number of agents is increased, the leadership 
structure and size in the scaling regime will not 
change! What changes, though, is the number of the 
"sheep", or followers, which is $N_0$. It will grow in proportion 
to N, as seen in Fig. 1d). Also, the cut-off at the 
high-k end of the distribution will occur at larger k as N 
is increased. The deviation of the function $f_k(N,m;p)$ 
from a constant accounts for the fluctuations in the leadership 
structure, which vanish with increasing 
m. This is due to the fact that the strategy space 
suffers a combinatorial explosion as m is increased (there 
are in total $2^{2^m}$ strategies), and the agents' strategies 
therefore become highly uncorrelated. 
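
As a worked illustration of this combinatorial explosion, the size of the full strategy space for the memory lengths used in this study is evaluated below. The only input is the counting rule itself: a strategy assigns one of two outcomes to each of the $2^m$ possible histories.

```latex
\[
  2^{2^{m}}:\qquad
  m=2 \;\Rightarrow\; 2^{4}=16,\qquad
  m=4 \;\Rightarrow\; 2^{16}=65\,536,\qquad
  m=6 \;\Rightarrow\; 2^{64}\approx 1.8\times 10^{19},\qquad
  m=8 \;\Rightarrow\; 2^{256}\approx 1.2\times 10^{77}.
\]
```

For example, with N = 101 agents holding S = 2 strategies each, the NS = 202 strategies drawn at m = 2 must contain many repetitions of the 16 possible ones, whereas at m = 8 a repetition is extremely unlikely.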

This suggests that the results for large m can be reproduced 
if the agents simply play a Random Choice Game 
(RCG) on the network. In an RCG, agents do not use 
strategies, but instead just toss a coin when making predictions. 
Indeed, Fig. 1a) shows that the RCG on the 
ER network produces the same scale-free backbone of 
the leadership structure. Thus, in our model the closeness 
to the scale-free backbone is determined by the level 
of mutual de-correlation of agents' strategies. This is to 
say that increased trait diversity (strategy space) leads 
to a stable scale-free leadership structure. 
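
A Random Choice Game on an ER substrate is simple enough to sketch in full. The code below is an illustration under assumptions, not the simulation used for Fig. 1: the function name `rcg_in_degree_counts`, the default parameters, the random tie-breaking among equally scored candidates, and the convention that self-following adds no edge are all hypothetical choices. It returns the number of agents with k followers, averaged over rounds, which can be inspected for the 1/k behavior discussed above.

```python
import random
from collections import Counter


def rcg_in_degree_counts(n=201, p=0.1, n_rounds=300, seed=0):
    """Random Choice Game on an ER graph; returns {k: average number of agents
    with exactly k followers}, averaged over all rounds."""
    rng = random.Random(seed)
    # Fixed ER substrate G.
    neighbors = [[] for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            if rng.random() < p:
                neighbors[a].append(b)
                neighbors[b].append(a)
    scores = [0] * n            # prediction score of every agent
    counts = Counter()
    for _ in range(n_rounds):
        # In the RCG every prediction is a coin toss (0 = A, 1 = B).
        predictions = [rng.randint(0, 1) for _ in range(n)]
        in_degree = [0] * n
        actions = []
        for j in range(n):
            candidates = neighbors[j] + [j]
            best = max(scores[c] for c in candidates)
            # Assumption: ties are broken uniformly at random.
            leader = rng.choice([c for c in candidates if scores[c] == best])
            if leader != j:
                in_degree[leader] += 1
            actions.append(predictions[leader])
        n_b = sum(actions)
        minority = 1 if n_b < n - n_b else 0   # an odd n avoids ties
        for i in range(n):
            if predictions[i] == minority:
                scores[i] += 1
        counts.update(in_degree)
    return {k: c / n_rounds for k, c in sorted(counts.items())}
```

The entry for k = 0 estimates the number of pure followers, and the ratio of the entry for k to the entry for k = 1 corresponds to the normalized quantity plotted in Fig. 1.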

Although the leadership structure is stable for large 
m, the position of an individual agent in the leadership 
hierarchy is not. By computing the time correlations 
present in the number of in-links we can show that the 
average lifetime of an agent in a particular leadership 
position is short for large m, as detailed elsewhere. In 
contrast, at low m values, leaders become frozen in their 
positions. In other words, in the low m regime, where 
trait diversity is small, as in a dictatorship, where agents' 
action space is severely limited, leaders "live" longer in 
their positions. 

Next, we briefly study the global performance of the 
collective on the network. Consider choice A as the reference 
option, and denote by A(t) the attendance, or the 
number of agents choosing option A at time t. One of 
the most frequently used measures for a "world utility" 
function for the collective is the variance $\sigma$ of the 
fluctuations in the time series of A(t). In the language of 
economics, it is the volatility of the market, and from a 
systems design point of view [20] it is the quantity that 
we ultimately want to minimize. 
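
The fluctuation measure can be computed directly from an attendance time series. The sketch below is an illustration with hypothetical function names (`attendance_fluctuations`, `coin_toss_sigma`); it returns both the variance and its square root so that either convention can be compared with reported values. As a reference point, if N agents choose independently by tossing a fair coin, A(t) is binomially distributed with variance N/4, i.e. a standard deviation of $0.5\sqrt{N}$.

```python
import math


def attendance_fluctuations(attendance):
    """Return (variance, standard deviation) of the attendance series A(t)."""
    n = len(attendance)
    mean = sum(attendance) / n
    variance = sum((a - mean) ** 2 for a in attendance) / n
    return variance, math.sqrt(variance)


def coin_toss_sigma(n_agents):
    """Standard deviation of A(t) when all agents choose independently at random."""
    return 0.5 * math.sqrt(n_agents)


variance, sigma = attendance_fluctuations([1, 2, 3, 4])
print(variance)               # 1.25
print(coin_toss_sigma(100))   # 5.0
```

A collective is more efficient than independent coin tossing when its measured fluctuation falls below this baseline.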

FIG. 2: Collective efficiency. a) shows the time-averaged 
volatility (over 5 x 10^ steps) of the market as a function of 
the substrate network connectivity parameter, p. The empty 
circles (m = 2) and the solid squares (m = 8) are obtained 
by fixing the strategy space disorder and taking 50 random 
network samples, while the crosses (m = 2) and the diamonds 
(m = 8) are obtained with the network space disorder fixed 
for 50 strategy disorders. Here S = 2 and N = 101. b) shows 
a sample time series (green/gray) for one of the low-lying 
points in a) at p = 0.1, m = 2. The black time series corresponds 
to a run for the ordinary MG at minimum volatility, 
which is at m = 6, S = 2. The black curve has a variance of 
2.36, while the green/gray has a variance of 1.07. 

As mentioned before, this game has two types of 
quenched disorder embedded into it. A natural question 
then is whether one can find/evolve networks that achieve 
zero, or almost zero, volatility given a group and their 
strategies, or, alternatively, whether one can find strategies 
that achieve zero, or near zero, volatility given a particular 
substrate network. To answer this question, we 
performed simple random searches in one of the quenched 
disorder spaces (network or strategy), keeping the other 
quenched disorder fixed (strategy or network). An example 
with m = 2 and m = 8 is displayed in Fig. 2a) as a 
function of connectivity p. The first conclusion is that 
overall, the collective does worse with "smart agents" 
(large m) on highly connected networks if they exchange 
information about their strategies. However, in the low 
m regime (m = 2), the system efficiency can improve 
not only beyond that of the standard MG, but also beyond 
that of the RCG without network (blue line with 
$\sigma_{RCG} = 0.5\sqrt{N}$), and even beyond the standard MG's 
best performance (which is at a different value of m = 6 
for these parameters). Thus, a networked, low trait diversity 
system can be more effective as a collective than 
a sophisticated group. Note that the optimal p values 
are still much larger than the critical value for the giant 
component in the ER network, which is 1/N, and thus we 
need well-connected single-component graphs in order to 
observe the collective efficiency emerge from the agent-agent 
interactions. However, the optimal values are actually 
in the realistic range for social networks, giving 
for the average number of contacts $\lambda = pN \sim 10$ to 20. 
If N is varied, the optimum range for p shifts such that the 
optimum value of pN remains constant. Fig. 2b) shows a 
sample time series from the optimal connectivity region. 
Notice the low volatility compared to the best performance 
of the MG (in the background). In the standard 
MG the variations in $\sigma$ at the best performance point are 
low, and even an extended search (500 samples) in the 
strategy disorder space could not generate values of $\sigma$ lower than 
2.0, while in contrast, time series such as the low-volatility one in 
Fig. 2b) are easily generated within 50 random samples 
in the optimal connectivity region. This emerging collective 
efficiency can be understood in terms of the crowd-anticrowd 
description of the MG, as introduced by Johnson, 
Hart and Hui. In the MG, low m means that 
only a small number of different strategies are possible, 
thus many agents are forced to use the same strategy and 
thus they behave as a crowd, or a group. This grouping 
effect generates the large volatility in the ordinary 
MG. When the game is played on a network, however, 
an agent, even if it shares the same strategy as the others 
in a large group, now has the possibility to listen to 
some other agents, possibly even from other groups. 
Thus, it is no longer forced to behave the same way as 
its own group, thereby breaking the grouping behavior. 
If, however, p is too large, a grouping behavior 
appears due to the network, because an agent will have 
too many followers if his score is the highest, creating a 
group on the network. The two crowding effects compete, 
and a balance between them is reached in the optimum 
connectivity region. 



In summary, we have shown that the evolution of 
multi-agent games can strongly depend on the nature of 
the agents' information resources, including local information 
gathered on the social network, a network whose 
structure in turn is influenced by the fate of the game 
itself. In our study, we allowed for this dynamic coupling 
between the game and the network by using reinforcement 
learning as a ubiquitous mechanism for inter-agent 
communications. Our observations are: 1) if reinforcement 
learning is used, a scale-free leadership structure 
can be created, even on the backbone of non-scale-free 
networks; 2) in low trait diversity collectives, enhanced 
collective efficiency may appear, making this effect 
worthwhile for systems design studies [20]. 